On Canonical Forms of Complete Problems via First-order Projections



The class of problems complete for NP via first-order reductions is known to be 
characterized by existential second-order sentences of a fixed form. All such sentences 
are built around the so-called generalized IS-form of the sentence that defines
IndependentSet. This result can also be understood as saying that every sentence that defines
an NP-complete problem P can be decomposed into two disjuncts such that the first one
characterizes a fragment of P as hard as IndependentSet and the second the rest of
P. That is, the decomposition divides every such sentence into a "quotient and
residue" modulo IndependentSet.

In this paper, we show that this result can be generalized to a wide collection of
complexity classes, including the so-called nice classes. Moreover, we show that such a
decomposition can be done for any complete problem with respect to the given class, and
that two such decompositions are non-equivalent in general. Interestingly, our results
are based on simple and well-known properties of first-order reductions.

Keywords: Finite Model Theory, Complexity Theory, First-Order Reductions, Canonical 
Forms 

1 Introduction 

Descriptive complexity studies the interplay between complexity theory, finite model theory 
and mathematical logic. Since its inception in 1974 [3], descriptive complexity has been
able to characterize all major complexity classes in terms of logical languages independent of
any computational model, thus suggesting that the computational complexity of languages
is a property intrinsic to them and not an accidental consequence of our choice of
computational model.

In descriptive complexity, problems are understood as sets of (finite) models which 
are described by logical formulae over given vocabularies. Reductions between problems 
correspond to logical relations between the sets of models that characterize the problems.
First-order reductions play a role in descriptive complexity as important as that of
polynomial many-one reductions in structural complexity, and among them the
first-order projections (fops) stand out. A fop is a very weak type of polynomial-time
reduction whose study has provided interesting results, such as that common NP-complete
problems like Sat, Clique and others remain complete via fop reductions, and that such
NP-complete problems can be described by logical sentences in a canonical form [2], [5].

In this paper we continue the study of the syntactic aspects of complete problems via fop
reductions, extending the work of Medina and Immerman [7]. In particular, we provide a
general characterization of complete problems via fops for a large collection of complexity
classes that reaches well beyond NP, including classes like P, PSPACE, $\Pi^p_k$, and
others. Interestingly, our results rely on very general assumptions and tools already known
in the field.

The paper is organized as follows. In Sect. 2, we give standard definitions and known 
results which provide the theoretical framework of the paper and make it self-contained.
Sect. 3 contains our main result, namely the generalization of the Medina-Immerman result,
together with relevant remarks and some examples. Later, Sect. 4 shows a general result 
about the existence of non-isomorphic problems via fop reductions, which implies that our 
canonical form is indeed minimal. Finally, Sect. 5 concludes with a brief summary and 
directions for future work. 

2 Preliminaries 

2.1 Logics, Finite Models, and Decision Problems 

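A logical vocabulary is a tuple $\tau = \langle R_1^{a_1}, \ldots, R_r^{a_r}, c_1, \ldots, c_s, f_1^{r_1}, \ldots, f_t^{r_t} \rangle$ where the $R_j$'s are
relational symbols of arity $a_j$, the $c_j$'s are constant symbols, and the $f_k$'s are $r_k$-ary functional
symbols. A structure for $\tau$, also referred to as a $\tau$-structure or just structure if $\tau$ is clear from
context, is a tuple $\mathcal{A} = \langle |\mathcal{A}|, R_1^{\mathcal{A}}, \ldots, R_r^{\mathcal{A}}, c_1^{\mathcal{A}}, \ldots, c_s^{\mathcal{A}}, f_1^{\mathcal{A}}, \ldots, f_t^{\mathcal{A}} \rangle$ where

• $|\mathcal{A}|$ is the universe (or domain) of $\mathcal{A}$,

• $R_j^{\mathcal{A}} \subseteq |\mathcal{A}|^{a_j}$ is an $a_j$-ary relation over $|\mathcal{A}|$,

• $c_j^{\mathcal{A}} \in |\mathcal{A}|$ is an element of the universe, and

• $f_k^{\mathcal{A}} : |\mathcal{A}|^{r_k} \to |\mathcal{A}|$ is a total $r_k$-ary function over $|\mathcal{A}|$.

For vocabulary $\tau$, $\mathrm{STRUC}[\tau]$ denotes the class of all finite $\tau$-structures, i.e. those whose
universe is an initial segment $\{0, 1, \ldots, n-1\}$ of $\mathbb{N}$.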

In addition to the above logical symbols, we also have the numerical relational symbols '=',
'<', 'BIT' and 'suc', and constants '0' and 'max', which are assumed to belong to each
vocabulary, and have fixed interpretations on every structure $\mathcal{A}$:

• = and < are interpreted as the usual equality and order on $\mathbb{N}$,

• $\mathcal{A} \models \mathrm{BIT}(i, j)$ iff the $j$-th bit in the binary representation of $i$ is 1,

• $\mathcal{A} \models \mathrm{suc}(x, y)$ iff $y$ is the successor of $x$ in the usual order on $\mathbb{N}$, and

• 0 and max denote the least and greatest element in $|\mathcal{A}|$.
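
The fixed numerical symbols can be illustrated with a minimal Python sketch. It assumes that bit positions are counted from the least significant bit starting at 0 (the text does not fix this convention), and the function names are hypothetical.

```python
def bit(i: int, j: int) -> bool:
    """BIT(i, j): the j-th bit of the binary representation of i is 1.

    Assumption: position 0 is the least significant bit.
    """
    return (i >> j) & 1 == 1


def suc(x: int, y: int) -> bool:
    """suc(x, y): y is the successor of x in the usual order on N."""
    return y == x + 1


def numeric_constants(n: int) -> tuple[int, int]:
    """Return (0, max) for a structure with universe {0, ..., n-1}, n >= 1."""
    return 0, n - 1
```

Since these predicates depend only on the numbers involved and on the size of the universe, their value is the same on all structures of a given size, which is the property the later proofs rely on when they speak of numerical formulae.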
If $\mathcal{L}$ denotes a logic, the language $\mathcal{L}[\tau]$ is the set of all well-formed formulae of $\mathcal{L}$ over the
vocabulary $\tau$. A numerical formula in $\mathcal{L}[\tau]$ is a formula with only numerical symbols. For
example, $\mathrm{SO}\exists[\tau]$ is the set of all second-order formulae of form $\exists Q_1 \cdots \exists Q_n \Phi$ where the $Q_i$'s
are relational variables and $\Phi$ is a first-order formula over vocabulary $\tau$. As usual, FO
denotes first-order logic and SO denotes second-order logic.

A formula with no free variables is a sentence. For a sentence $\varphi \in \mathcal{L}[\tau]$, the class of all finite
models that satisfy $\varphi$ is denoted as $\mathrm{MOD}[\varphi]$. For fixed $\tau$, it is possible to code every finite
$\tau$-structure into a sequence of bits, i.e. a binary string, using a map from $\tau$-structures to $\{0,1\}^*$.
Hence, a collection of finite models can be represented as a collection of strings, or language.

In descriptive complexity, a decision problem P is characterized by a subset of models
from $\mathrm{STRUC}[\tau]$ for some fixed $\tau$. For example, the problem Clique can be characterized
by structures $\mathcal{A} = \langle |\mathcal{A}|, E^{\mathcal{A}}, k^{\mathcal{A}} \rangle$ over the vocabulary $\tau = \langle E^2, k \rangle$, where $E$ is a binary
relational symbol and $k$ is a constant, such that $G = \langle |\mathcal{A}|, E^{\mathcal{A}} \rangle$ makes up an undirected
graph and $k^{\mathcal{A}} \in \{0, \ldots, |\mathcal{A}| - 1\}$ denotes the size of a clique in $G$. Such models are typically
characterized by a sentence $\Psi$ over some fragment $\mathcal{L}$. The problem Clique, for example,
can be characterized with an $\mathrm{SO}\exists$ sentence over $\tau$ [3]; see below.
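
To make the view of a problem as a set of finite structures concrete, here is a minimal Python sketch that decides membership in Clique by brute force. The representation of a structure as a triple `(n, edges, k)` and the function name are assumptions made only for this illustration; the paper itself works with logical sentences, not algorithms.

```python
from itertools import combinations


def in_clique(n: int, edges: set[tuple[int, int]], k: int) -> bool:
    """Decide whether the structure <{0..n-1}, E, k> belongs to Clique.

    The edge relation is symmetrised so that the graph is undirected.
    """
    adjacent = set(edges) | {(v, u) for (u, v) in edges}
    # Some set of k vertices must be pairwise adjacent.
    return any(
        all((u, v) in adjacent for u, v in combinations(subset, 2))
        for subset in combinations(range(n), k)
    )
```

For instance, a triangle with one pendant vertex contains a clique of size 3 but none of size 4.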

2.2 First-Order Queries, Fops, and Duals 

Let $\tau$ and $\sigma$ be two vocabularies where $\sigma = \langle R_1^{a_1}, \ldots, R_r^{a_r}, c_1, \ldots, c_s \rangle$ has no functional
symbols (from now on, we only consider vocabularies with no functional symbols). Let
$k \ge 1$ and consider the tuple $I = \langle \varphi_0, \ldots, \varphi_r, \psi_1, \ldots, \psi_s \rangle$ of $r + s + 1$ first-order formulae in
$\mathrm{FO}[\tau]$ of form $\varphi_0(x_1, \ldots, x_k)$, $\varphi_i(x_1, \ldots, x_{k a_i})$ and $\psi_j(x_1, \ldots, x_k) = (x_1 = c'_{j1} \wedge \cdots \wedge x_k = c'_{jk})$
where the $c'_{ji}$'s are constant symbols from $\tau$ (possibly with repetitions). That is, $\varphi_0$ has at
most $k$ free variables among $x_1, \ldots, x_k$, $\varphi_i$ has at most $k a_i$ free variables among $x_1, \ldots, x_{k a_i}$,
and $\psi_j$ denotes a tuple in $\{c^{\mathcal{A}} : c \in \tau\}^k$.

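Such a tuple defines a mapping $\mathcal{A} \mapsto I(\mathcal{A})$, called a first-order query of arity $k$, from
$\tau$-structures into $\sigma$-structures given by:

• the universe $|I(\mathcal{A})| = \{(u_1, \ldots, u_k) \in |\mathcal{A}|^k : \mathcal{A} \models \varphi_0(u_1, \ldots, u_k)\}$ is ordered
lexicographically,

• the relations are $R_i^{I(\mathcal{A})} = \{(u_1, \ldots, u_{a_i}) \in |\mathcal{A}|^{k a_i} : \mathcal{A} \models \varphi_i(u_1, \ldots, u_{a_i})\}$, where each $u_\ell$ is a $k$-tuple,

• the constants are $c_j^{I(\mathcal{A})} = u$ for the unique $u$ with $\mathcal{A} \models \varphi_0(u) \wedge \psi_j(u)$.

Furthermore, if for $T \subseteq \mathrm{STRUC}[\tau]$ and $S \subseteq \mathrm{STRUC}[\sigma]$, it is true that $\mathcal{A} \in T$ iff $I(\mathcal{A}) \in S$,
then $I$ is called a first-order reduction from $T$ to $S$.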
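
The construction of the image structure can be sketched in Python. This is an illustration under stated assumptions, not the paper's formalism: formulae are replaced by Python predicates, a structure is a dictionary with keys `"n"`, `"rel"` and `"const"`, and all names (`apply_query`, `complement_query`) are hypothetical. The tuples of the new universe are renamed to `0, ..., m-1` following their lexicographic order.

```python
from itertools import product


def apply_query(A, k, phi0, phis, psis):
    """Apply a k-ary query to the structure A.

    phi0(A, u)            -> bool, selects the k-tuples of the new universe
    phis[name] = (a, phi) -> phi(A, u_1, ..., u_a) defines the a-ary relation
    psis[name] = (c_1, ..., c_k) names of constants of A forming a k-tuple
    """
    # itertools.product enumerates tuples in lexicographic order.
    universe = [u for u in product(range(A["n"]), repeat=k) if phi0(A, u)]
    index = {u: i for i, u in enumerate(universe)}
    rel = {}
    for name, (arity, phi) in phis.items():
        rel[name] = {
            tuple(index[u] for u in us)
            for us in product(universe, repeat=arity)
            if phi(A, *us)
        }
    # Each constant tuple is assumed to satisfy phi0 (KeyError otherwise).
    const = {
        name: index[tuple(A["const"][c] for c in cs)]
        for name, cs in psis.items()
    }
    return {"n": len(universe), "rel": rel, "const": const}


def complement_query(A):
    """Unary query that complements the edge relation E and keeps k."""
    return apply_query(
        A,
        1,
        lambda A, u: True,
        {"E": (2, lambda A, x, y: (x[0], y[0]) not in A["rel"]["E"])},
        {"k": ("k",)},
    )
```

The unary complement query used here as a usage example maps a graph to the graph with the complementary edge relation; applying it twice returns the original structure.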

A first-order query is called a first-order projection (fop) if $\varphi_0$ is numerical and each $\varphi_i$
has form:

$$\varphi_i(\bar{x}) = \alpha_0(\bar{x}) \vee (\alpha_1(\bar{x}) \wedge \lambda_1(\bar{x})) \vee \cdots \vee (\alpha_r(\bar{x}) \wedge \lambda_r(\bar{x}))$$

where the $\alpha_j$'s are numerical and mutually exclusive, and each $\lambda_j$ is a $\tau$-literal. Projections
are typically denoted by the letter $\rho$. If $\rho$ is a reduction from $\Pi$ to $S$ we write $\rho : \Pi \le_{\mathrm{fop}} S$,
and if $\Pi$ is complete via $\le_{\mathrm{fop}}$ reductions we say that $\Pi$ is $\le_{\mathrm{fop}}$-complete.

Projections have interesting properties. For example, projections are a special case of
Valiant's non-uniform projections [10]. For our purposes, we are interested in the fact that
for each projection $\rho$ there is a first-order sentence $\beta_\rho \in \mathrm{FO}[\sigma]$ that characterizes the image
of $\rho$, i.e. $\mathcal{B} \models \beta_\rho$ iff $\mathcal{B} = \rho(\mathcal{A})$ for some $\mathcal{A} \in \mathrm{STRUC}[\tau]$. The sentence $\beta_\rho$ is called the
characteristic sentence of $\rho$ [1].

Finally, there is a syntactic operator associated to a first-order query that plays a
fundamental role in our results. For a first-order query $I : \mathrm{STRUC}[\tau] \to \mathrm{STRUC}[\sigma]$, the dual
operator $\hat{I}$ maps formulae in $\mathcal{L}[\sigma]$ to formulae in $\mathcal{L}[\tau]$ in such a way that

$$\mathcal{A} \models \hat{I}(\theta) \quad \text{if and only if} \quad I(\mathcal{A}) \models \theta$$

for all $\theta \in \mathcal{L}[\sigma]$ and $\mathcal{A} \in \mathrm{STRUC}[\tau]$ [?, Sect. 3.2].

2.3 Complexity Classes

For a family $\mathcal{F}$ of proper complexity functions [9], we consider the complexity classes
$\mathrm{TIME}(\mathcal{F}) = \bigcup_{f \in \mathcal{F}} \mathrm{TIME}(f)$, and similarly for non-deterministic time and space. A
complexity class $\mathcal{C}$ defined by $\mathcal{F}$ is nice if it has a universal problem of the form

$$U_{\mathcal{C}} = \{\langle M_i, w, 1^t \rangle : M_i \text{ accepts } w \text{ within } f_i(t) \text{ resources}\}$$

where $L(M_i) \in \mathcal{C}$ and $f_i \in \mathcal{F}$ bounds the resources of $M_i$. Some well-known classes that are
nice are L, NL, P, NP and PSPACE. Allender et al. [1] showed that if $\Pi$ is $\le_{\mathrm{fop}}$-complete
for a nice class $\mathcal{C}$, then it is complete via injective fops of arity at least 2. The following
properties are shown easily:

Proposition 1 Let $\mathcal{C}$ be a complexity class defined by family $\mathcal{F}$. Then, (a) if $\mathcal{C}$ is a
deterministic class and $\mathcal{F}$ is closed under sums, then $\mathcal{C}$ is closed under finite unions; (b)
if $\mathcal{C}$ is a nondeterministic class and $\mathcal{F}$ is such that for every $f, g \in \mathcal{F}$ there is $h \in \mathcal{F}$ with
$f, g \le h$, then $\mathcal{C}$ is closed under finite unions; (c) if $\mathcal{C}$ is closed under finite unions and $\mathcal{C}$
is captured by logic $\mathcal{L}$, then $\mathcal{L}$ is closed under disjunctions.

The nice classes L, NL, P, NP and PSPACE are known to be characterized by SO-DetKrom,
SO-Krom, SO-Horn, $\mathrm{SO}\exists$ and SO+TC respectively [?], [5]. Additionally, $\Sigma^p_k$ and $\Pi^p_k$
are characterized by $\mathrm{SO}\exists\forall \cdots Q_k$ and $\mathrm{SO}\forall\exists \cdots Q'_k$ sentences where $Q_k = \exists$, $Q'_k = \forall$ if $k$
is odd, and $Q_k = \forall$, $Q'_k = \exists$ if $k$ is even. Thus, by the proposition, all these logical fragments
are closed under disjunctions, and also under conjunctions with first-order formulae. We
will make use of these facts later.

3 Canonical Forms of Complete Problems

Medina and Immerman characterized $\le_{\mathrm{fop}}$-complete problems for NP syntactically using
the IndependentSet problem. This problem consists of checking whether an input graph
$G$ has an independent set of size $k$. IndependentSet is known to be complete for NP
under different notions of reductions, and in particular, under fop reductions [7].
IndependentSet is characterized by the following $\mathrm{SO}\exists[\tau]$ sentence, for $\tau = \langle E^2, k \rangle$:

$$\Psi_{IS} = (\exists f \in \mathrm{Inj})(\forall x, y)[x \neq y \wedge f_x < k \wedge f_y < k \rightarrow \neg E(x, y)] \qquad (1)$$
where '$f \in \mathrm{Inj}$' means that $f$ is a total and 1-1 function, i.e. an ordering of the elements of
the universe, and $f_x$ denotes $f(x)$. Although it seems that (1) quantifies over a functional
variable, $f$ is indeed a relational variable such that $f_x$ is the unique element such that
$f(x, f_x)$. The condition $f \in \mathrm{Inj}$ is easily defined in first-order logic. Observe that the only
second-order variable in (1) is $f$ which is existentially quantified.
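
The meaning of the sentence that defines IndependentSet can be checked on small structures with a brute-force Python sketch. It assumes that the existential quantifier over total 1-1 functions on a finite universe can be simulated by enumerating permutations, and the function name is hypothetical; the running time is exponential, so this is for illustration only.

```python
from itertools import permutations


def satisfies_psi_is(n: int, E: set[tuple[int, int]], k: int) -> bool:
    """Brute-force evaluation of the IndependentSet sentence on <{0..n-1}, E, k>.

    f ranges over the total 1-1 functions on the universe, i.e. over the
    permutations of {0, ..., n-1}; f[x] plays the role of f_x.
    """
    for f in permutations(range(n)):
        # The elements sent by f below k must be pairwise non-adjacent.
        if all(
            (x, y) not in E
            for x in range(n)
            for y in range(n)
            if x != y and f[x] < k and f[y] < k
        ):
            return True
    return False
```

The elements that $f$ sends below $k$ form a set of exactly $k$ vertices, so the sentence holds exactly when the graph has an independent set of size $k$.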

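Theorem 2 ([7]) Let $L \subseteq \mathrm{STRUC}[\sigma]$ be an NP problem characterized by $\Psi \in \mathcal{L}[\sigma]$ where
$\sigma = \langle Q^1 \rangle$ is the vocabulary of binary strings. Then, the problem $L$ is NP-complete via $\le_{\mathrm{fop}}$
reductions iff there is an injective fop $\rho : \mathrm{STRUC}[\langle E^2, k \rangle] \to \mathrm{STRUC}[\sigma]$ such that

$$\Psi \equiv (\beta_\rho \wedge \Upsilon_{IS}) \vee (\neg\beta_\rho \wedge \Lambda) \qquad (2)$$

where $\beta_\rho \in \mathrm{FO}[\sigma]$ is the characteristic sentence of $\rho$, $\Upsilon_{IS} \in \mathrm{SO}\exists[\sigma]$ is a generalized IS-form
[7], and $\Lambda$ is an $\mathrm{SO}\exists[\sigma]$ sentence.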

Intuitively, this result says that if the sentence $\Psi$ characterizes a $\le_{\mathrm{fop}}$-complete problem $L$
for NP, then it can be decomposed into two disjuncts $\Psi \equiv \Psi_{IS} \vee \Psi_{rest}$ such that $\mathrm{MOD}[\Psi_{IS}]$
is $\le_{\mathrm{fop}}$-complete for NP and $\mathrm{MOD}[\Psi_{rest}]$ equals the "rest" of $L$, which is not necessarily
complete.

Our main contribution is to show that the above result can be generalized to a wide
collection of complexity classes, including the nice classes, and that such a decomposition can
be done modulo any $\le_{\mathrm{fop}}$-complete problem for the given class. Moreover, we also show that
two such decompositions are not in general equivalent.

The main obstacle for such a generalization is to take care of the sentence $\Upsilon_{IS}$ for classes
other than NP. As will be shown, we do not have to consider each different class
in isolation, since the corresponding $\Upsilon$ sentences will be the duals of the sentence $\Psi$ that
characterizes the complete problem.

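Let us first define the relation $\equiv_\Pi$ over $\mathrm{STRUC}[\tau]$ with respect to a given problem
$\Pi \subseteq \mathrm{STRUC}[\tau]$. For structures $\mathcal{A}$ and $\mathcal{B}$, define

$$\mathcal{A} \equiv_\Pi \mathcal{B} \quad \text{iff} \quad (\mathcal{A} \in \Pi \Leftrightarrow \mathcal{B} \in \Pi). \qquad (3)$$

Clearly, $\equiv_\Pi$ is an equivalence relation that partitions $\mathrm{STRUC}[\tau]$ into $\Pi$ and its complement.

By using dual operators and the equivalence relation, we are able to show the following
generalization of Theorem 2. In the following, $\tau$ and $\sigma$ refer to any two vocabularies.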

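Theorem 3 (Main) Let $\mathcal{C}$ be a complexity class captured by a fragment $\mathcal{L}$ closed under
disjunctions and closed under conjunctions with FO. Let $\Pi \subseteq \mathrm{STRUC}[\tau]$ be a $\le_{\mathrm{fop}}$-complete
problem for $\mathcal{C}$ characterized by $\Psi \in \mathcal{L}[\tau]$, and $\mathbf{B}$ a problem over vocabulary $\sigma$. Then, $\mathbf{B}$ is
$\le_{\mathrm{fop}}$-complete for $\mathcal{C}$ if and only if there is a fop $\rho : \mathrm{STRUC}[\tau] \to \mathrm{STRUC}[\sigma]$ such that for
all $\mathcal{B} \in \mathrm{STRUC}[\sigma]$:

$$\mathcal{B} \in \mathbf{B} \quad \text{iff} \quad \mathcal{B} \models (\beta_\rho \wedge \hat{I}(\Psi)) \vee (\neg\beta_\rho \wedge \Lambda) \qquad (4)$$

where

(a) $\beta_\rho \in \mathrm{FO}[\sigma]$ is the characteristic sentence of $\rho$, i.e. $\mathcal{B} \models \beta_\rho$ iff $\mathcal{B} \in \rho(\mathrm{STRUC}[\tau])$,
(b) $\Lambda \in \mathcal{L}[\sigma]$, and
(c) $I : \mathrm{STRUC}[\sigma] \to \mathrm{STRUC}[\tau]$ is a first-order query such that for all $\mathcal{A} \in \mathrm{STRUC}[\tau]$,
$I(\rho(\mathcal{A})) \equiv_\Pi \mathcal{A}$.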

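Proof: For the necessity, assume that $\mathbf{B}$ is $\le_{\mathrm{fop}}$-complete for $\mathcal{C}$; i.e. $\mathbf{B}$ is characterized by
some sentence $\Lambda \in \mathcal{L}[\sigma]$ and there is $\rho : \Pi \le_{\mathrm{fop}} \mathbf{B}$. For $\mathcal{B} \in \mathbf{B}$ we consider the two cases
whether $\mathcal{B} \notin \rho(\mathrm{STRUC}[\tau])$ or not. For the first case, $\mathcal{B} \models \neg\beta_\rho \wedge \Lambda$. For the second case,
$\mathcal{B} \models \beta_\rho$ and

$$\begin{aligned}
\mathcal{B} = \rho(\mathcal{A}) \quad & \text{(for some } \mathcal{A} \in \mathrm{STRUC}[\tau] \text{ by (a))} \\
\Longrightarrow\ \mathcal{A} \in \Pi \quad & \text{(since } \rho \text{ is a reduction)} \\
\Longrightarrow\ \mathcal{A} \models \Psi \quad & \text{(} \Psi \text{ characterizes } \Pi \text{)} \\
\Longrightarrow\ I(\rho(\mathcal{A})) \models \Psi \quad & \text{(by condition (c))} \\
\Longrightarrow\ \rho(\mathcal{A}) \models \hat{I}(\Psi) \quad & \text{(def. of dual of } I \text{).}
\end{aligned}$$

Therefore, $\mathcal{B} \in \mathbf{B} \Longrightarrow \mathcal{B} \models (\beta_\rho \wedge \hat{I}(\Psi)) \vee (\neg\beta_\rho \wedge \Lambda)$. Now, let $\mathcal{B} \in \mathrm{STRUC}[\sigma]$ be such that
$\mathcal{B} \models (\beta_\rho \wedge \hat{I}(\Psi)) \vee (\neg\beta_\rho \wedge \Lambda)$. If $\mathcal{B} \models \Lambda$, then $\mathcal{B} \in \mathbf{B}$. Otherwise,

$$\begin{aligned}
\mathcal{B} \models \beta_\rho \wedge \hat{I}(\Psi) \quad & \\
\Longrightarrow\ \mathcal{B} = \rho(\mathcal{A}) \text{ and } \rho(\mathcal{A}) \models \hat{I}(\Psi) \quad & \text{(for some } \mathcal{A} \in \mathrm{STRUC}[\tau] \text{)} \\
\Longrightarrow\ I(\rho(\mathcal{A})) \models \Psi \quad & \text{(def. of dual)} \\
\Longrightarrow\ \mathcal{A} \models \Psi \quad & \text{(by (c))} \\
\Longrightarrow\ \mathcal{A} \in \Pi \quad & \text{(} \Psi \text{ characterizes } \Pi \text{)} \\
\Longrightarrow\ \mathcal{B} \in \mathbf{B} \quad & \text{(since } \rho \text{ is a reduction).}
\end{aligned}$$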

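It remains to show that there are first-order queries satisfying (c). Since $\Pi$ is complete,
there is a fop $I : \mathrm{STRUC}[\sigma] \to \mathrm{STRUC}[\tau]$ that reduces $\rho(\Pi)$ to $\Pi$. Note that $\rho(\Pi) \subseteq \mathbf{B}$
since $\rho$ is also a reduction. For $\mathcal{A} \in \mathrm{STRUC}[\tau]$, observe

$$\begin{aligned}
\mathcal{A} \in \Pi &\Longrightarrow \rho(\mathcal{A}) \in \rho(\Pi) \Longrightarrow I(\rho(\mathcal{A})) \in \Pi, \\
I(\rho(\mathcal{A})) \in \Pi &\Longrightarrow \rho(\mathcal{A}) \in \rho(\Pi) \Longrightarrow \rho(\mathcal{A}) \in \mathbf{B} \Longrightarrow \mathcal{A} \in \Pi.
\end{aligned}$$

Thus, $I : \rho(\Pi) \le_{\mathrm{fop}} \Pi$ satisfies $\mathcal{A} \in \Pi$ iff $I(\rho(\mathcal{A})) \in \Pi$; i.e. $\mathcal{A} \equiv_\Pi I(\rho(\mathcal{A}))$.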

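For the sufficiency, assume there is a fop $\rho : \mathrm{STRUC}[\tau] \to \mathrm{STRUC}[\sigma]$ such that (4) holds
for all $\mathcal{B} \in \mathrm{STRUC}[\sigma]$. We need to show that $\mathbf{B}$ is complete for $\mathcal{C}$. The inclusion $\mathbf{B} \in \mathcal{C}$
is direct from the closure properties of $\mathcal{L}$. For the hardness, we show that $\rho$ is indeed a
reduction from $\Pi$ to $\mathbf{B}$. For $\mathcal{A} \in \mathrm{STRUC}[\tau]$, we have $\rho(\mathcal{A}) \models \beta_\rho$. If $\mathcal{A} \in \Pi$, then

$$\mathcal{A} \models \Psi \Longrightarrow I(\rho(\mathcal{A})) \models \Psi \Longrightarrow \rho(\mathcal{A}) \models \hat{I}(\Psi) \Longrightarrow \rho(\mathcal{A}) \in \mathbf{B}.$$

On the other hand, if $\rho(\mathcal{A}) \in \mathbf{B}$, then

$$\rho(\mathcal{A}) \models \beta_\rho \Longrightarrow \rho(\mathcal{A}) \models \hat{I}(\Psi) \Longrightarrow I(\rho(\mathcal{A})) \models \Psi \Longrightarrow \mathcal{A} \models \Psi \Longrightarrow \mathcal{A} \in \Pi.$$

Thus, $\mathcal{A} \in \Pi$ iff $\rho(\mathcal{A}) \in \mathbf{B}$, $\rho$ is a reduction, and $\mathbf{B}$ is complete. □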
Corollary 4 The theorem holds if the first-order query $I$ is the reduction $I : \rho(\Pi) \le_{\mathrm{fop}} \Pi$
which exists since $\Pi$ is complete.

Moreover, a first-order query $J$ satisfying (c) is essentially equivalent (with respect to
$\Psi$) to the reduction $I : \rho(\Pi) \le_{\mathrm{fop}} \Pi$. Indeed, for such $J$ and a finite $\sigma$-structure $\mathcal{B} = \rho(\mathcal{A})$
for $\mathcal{A} \in \mathrm{STRUC}[\tau]$,

$$\mathcal{B} \models \hat{J}(\Psi) \iff J(\mathcal{B}) \models \Psi \iff \mathcal{A} \models \Psi \iff I(\mathcal{B}) \models \Psi \iff \mathcal{B} \models \hat{I}(\Psi).$$

If we consider nice complexity classes, then the fop $\rho$ can be assumed to be injective by
a result of Allender et al. [1].

Corollary 5 For nice classes, the fop $\rho : \mathrm{STRUC}[\tau] \to \mathrm{STRUC}[\sigma]$ can be assumed to be
injective.

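To see that Theorem 2 is equivalent to Corollary 5 when $\mathcal{C} = \mathrm{NP}$, let $\tau = \langle E^2, k \rangle$ and
$\sigma = \langle Q^1 \rangle$ be the vocabularies for graphs and binary strings respectively, and consider a
problem $L \subseteq \mathrm{STRUC}[\sigma]$ complete for NP characterized by $\Psi_L$. According to Theorem 2,

$$\Psi_L \equiv (\beta_\rho \wedge \Upsilon_{IS}) \vee (\neg\beta_\rho \wedge \Lambda)$$

where $\rho : \text{IndependentSet} \to L$ is a first-order projection and $\Lambda$ is an $\mathrm{SO}\exists$ sentence. On
the other hand, according to Corollary 5, $\Psi_L$ also satisfies

$$\Psi_L \equiv (\beta_\rho \wedge \hat{I}(\Psi_{IS})) \vee (\neg\beta_\rho \wedge \Lambda').$$

As shown before, $\Upsilon_{IS}$ and $\hat{I}(\Psi_{IS})$ are equivalent on $\rho(\mathrm{STRUC}[\tau])$, and thus $\Lambda$ and $\Lambda'$ must
be equivalent on $\mathrm{STRUC}[\sigma] \cap \mathrm{MOD}[\neg\beta_\rho]$.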

3.1 Examples 

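Consider Clique $\subseteq \mathrm{STRUC}[\tau = \langle E^2, k \rangle]$ characterized by the $\mathrm{SO}\exists$ sentence

$$\Psi_{CL} = (\exists f \in \mathrm{Inj})(\forall x, y)[x \neq y \wedge f_x < k \wedge f_y < k \rightarrow E(x, y)].$$

For $\sigma = \tau$, it is not hard to see that IndependentSet can be reduced to Clique using
the fop $\rho = \lambda_{xy} \langle \varphi_0, \varphi_1, \psi \rangle$, of arity 1, where

$$\varphi_0(x) = \text{true}, \quad \varphi_1(x, y) = \neg E(x, y), \quad \psi(x) = (x = k).$$

Clearly, if $\mathcal{A} = \langle |\mathcal{A}|, E^{\mathcal{A}}, k^{\mathcal{A}} \rangle$, then $|\rho(\mathcal{A})| = |\mathcal{A}|$, $E^{\rho(\mathcal{A})} = |\mathcal{A}|^2 \setminus E^{\mathcal{A}}$ and $k^{\rho(\mathcal{A})} = k^{\mathcal{A}}$.
Therefore, $\rho(\rho(\mathcal{A})) = \mathcal{A}$ for all $\mathcal{A} \in \mathrm{STRUC}[\tau]$, and hence

$$\rho(\rho(\mathcal{A})) \in \text{IndependentSet} \quad \text{iff} \quad \mathcal{A} \in \text{IndependentSet}.$$

Furthermore, $\beta_\rho = \text{true}$ and since Clique is also known to be NP-complete with respect
to $\le_{\mathrm{fop}}$ reductions, we have

$$\Psi_{CL} \equiv (\beta_\rho \wedge \hat{\rho}(\Psi_{IS})) \vee (\neg\beta_\rho \wedge \Lambda) \equiv \hat{\rho}(\Psi_{IS}).$$

Conversely, beginning with the observation that $\beta_\rho = \text{true}$ and $\hat{\rho}(\Psi_{IS}) \equiv \Psi_{CL}$ we can
conclude, by Theorem 3, that Clique is $\le_{\mathrm{fop}}$-complete for NP. We call this formulation
of Clique its canonical form with respect to IndependentSet. In this example, the
formula $\Psi_{CL}$ was already in its canonical form with respect to IndependentSet.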
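
The two facts used in this example, that the complement projection is its own inverse and that a structure satisfies the Clique sentence exactly when its image satisfies the IndependentSet sentence, can be confirmed exhaustively on small universes. The Python sketch below is only a sanity check under the assumption that structures are triples `(n, E, k)`; the names `holds`, `rho` and `check_all` are hypothetical.

```python
from itertools import permutations, product


def holds(n, E, k, want_edge):
    """Brute-force evaluation of the Clique sentence (want_edge=True) or of
    the IndependentSet sentence (want_edge=False) on <{0..n-1}, E, k>."""
    return any(
        all(
            ((x, y) in E) == want_edge
            for x, y in product(range(n), repeat=2)
            if x != y and f[x] < k and f[y] < k
        )
        for f in permutations(range(n))  # f ranges over total 1-1 functions
    )


def rho(n, E, k):
    """Complement projection: same universe, complemented E, same k."""
    complement = {p for p in product(range(n), repeat=2) if p not in E}
    return n, complement, k


def check_all(n):
    """Test every binary relation E on {0..n-1} and every constant k < n."""
    pairs = list(product(range(n), repeat=2))
    for mask in range(2 ** len(pairs)):
        E = {p for i, p in enumerate(pairs) if mask >> i & 1}
        for k in range(n):
            A = (n, E, k)
            if rho(*rho(*A)) != A:
                return False
            # A satisfies the dual of the IS sentence iff rho(A) satisfies it.
            if holds(*A, want_edge=True) != holds(*rho(*A), want_edge=False):
                return False
    return True
```

Calling `check_all(3)` runs through all 512 binary relations on a three-element universe; such a finite check illustrates the claim but is of course no substitute for the argument in the text.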

For a second example, consider the problem SubGraphIso defined by tuples $(G, G')$
such that the graph $G$ contains a subgraph isomorphic to graph $G'$. Such tuples can be
expressed with the vocabulary $\sigma = \langle F^2, H^2, k \rangle$ where $F$ and $H$ define the edges of $G$
and $G'$, and the constant $k$ defines the initial segment $\{0, \ldots, k-1\}$ for the edges of $G'$.
Among other things, instances of SubGraphIso are identified with structures $\mathcal{B}$ in which
$H^{\mathcal{B}} \subseteq \{0, \ldots, k-1\}^2$. SubGraphIso is defined by the $\mathrm{SO}\exists$ sentence $\Psi_{SG}$

$$(\exists f \in \mathrm{Inj})(\forall x, y)[x \neq y \wedge f_x < k \wedge f_y < k \rightarrow (H(f_x, f_y) \rightarrow F(x, y))].$$

A fop reduction $\rho$ from Clique into SubGraphIso outputs $(G, K_k, k)$ on input $(G, k)$. The
fop is $\rho = \langle \varphi_0, \varphi_1, \varphi_2, \psi \rangle$ given by

$$\varphi_0 = \text{true}, \quad \varphi_1 = E(x, y), \quad \varphi_2 = (x < k \wedge y < k), \quad \psi = (x = k).$$

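The characteristic sentence of $\rho$ states that $H$ is exactly the relation defined by $\varphi_2$, while
$F$ is unconstrained:

$$\beta_\rho = (\forall x, y)[H(x, y) \leftrightarrow (x < k \wedge y < k)].$$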

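The reduction $I : \rho(\text{Clique}) \le_{\mathrm{fop}} \text{Clique}$ given by $I = \langle \varphi_0 = \text{true}, \varphi_1 = F(x, y) \rangle$ satisfies
$\mathcal{B} \in \rho(\text{Clique})$ if and only if $I(\mathcal{B}) \in \text{Clique}$ for all $\mathcal{B}$. Since $\Psi_{SG}$ is equivalent to $(\beta_\rho \wedge
\hat{I}(\Psi_{CL})) \vee (\neg\beta_\rho \wedge \Psi_{SG})$, then, by Corollary 5, SubGraphIso is complete for NP via $\le_{\mathrm{fop}}$
reductions.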
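
The reduction from Clique into SubGraphIso and the resulting decomposition can be checked on small instances with the Python sketch below. It rests on several assumptions made only for illustration: structures are tuples, sentences are evaluated by brute force over permutations, the image of the projection is taken to be the structures whose H is the full relation on {0, ..., k-1}, and all function names are hypothetical.

```python
from itertools import permutations, product


def psi_cl(n, E, k):
    """Clique sentence on <{0..n-1}, E, k>, by brute force."""
    return any(
        all((x, y) in E for x, y in product(range(n), repeat=2)
            if x != y and f[x] < k and f[y] < k)
        for f in permutations(range(n)))


def psi_sg(n, F, H, k):
    """SubGraphIso sentence on <{0..n-1}, F, H, k>, by brute force."""
    return any(
        all((f[x], f[y]) not in H or (x, y) in F
            for x, y in product(range(n), repeat=2)
            if x != y and f[x] < k and f[y] < k)
        for f in permutations(range(n)))


def rho(n, E, k):
    """Projection (G, k) -> (G, K_k, k): F copies E, H is full on {0..k-1}."""
    return n, set(E), set(product(range(k), repeat=2)), k


def in_image(n, F, H, k):
    """Assumed reading of the characteristic sentence of rho."""
    return H == set(product(range(k), repeat=2))


def query_i(n, F, H, k):
    """Query I that forgets H and reads F as the edge relation."""
    return n, F, k


def canonical(n, F, H, k):
    """Right-hand side of the decomposition, with Lambda taken as Psi_SG."""
    B = (n, F, H, k)
    return psi_cl(*query_i(*B)) if in_image(*B) else psi_sg(*B)


def reduction_is_correct(n):
    """Compare Clique, SubGraphIso and the canonical form on all inputs."""
    pairs = list(product(range(n), repeat=2))
    for mask in range(2 ** len(pairs)):
        E = {p for i, p in enumerate(pairs) if mask >> i & 1}
        for k in range(n):
            B = rho(n, E, k)
            if not (psi_cl(n, E, k) == psi_sg(*B) == canonical(*B)):
                return False
    return True
```

On structures outside the image the canonical form falls back to the original sentence, so the interesting comparison is the one on the image, where the dual of the Clique sentence takes over.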

Finally, other classes that satisfy the conditions of Corollary 5 are L, NL, P, PSPACE,
and all $\Sigma^p_k$ and $\Pi^p_k$.

4 Non-Isomorphic Complete Problems for Nice Classes 

The next result is a more general version of one already known for NP [7]. The proof is
analogous to the NP case. Among other things, it implies that we cannot get rid of the
disjunction in the canonical form of Sect. 3.

Theorem 6 If C is a nice complexity class, then there are two C-complete problems that 
are not fop-isomorphic. 

Proof: Let $\Gamma \subseteq \{0, 1\}^*$ be a $\le_{\mathrm{fop}}$-complete problem for $\mathcal{C}$, and define $\Gamma' = \{w0, w1 : w \in \Gamma\}$.
It is easy to see that $\Gamma'$ is complete via fops; e.g. define the projection $\rho : \mathrm{STRUC}[\tau =
\langle S^1 \rangle] \to \mathrm{STRUC}[\sigma = \langle T^1 \rangle]$, of arity 2, as $\rho = \langle \varphi_0(x, y), \varphi_1(x, y) \rangle$ where $\varphi_0(x, y) = (x =
0) \vee (x = 1 \wedge y = 0)$ gives the domain of $\rho(\mathcal{A})$ and $\varphi_1(x, y) = (x = 0 \wedge S(y)) \vee (x = 1 \wedge y = 0)$
gives $T^{\rho(\mathcal{A})}$. Thus, for $\mathcal{A}$ with domain $|\mathcal{A}| = \{0, \ldots, n-1\}$, $\varphi_0$ defines

$$|\rho(\mathcal{A})| = \{(0, y) : 0 \le y < n\} \cup \{(1, 0)\}.$$

Formula $\varphi_1$ identifies the $n$ bits of $\mathcal{A}$ with the tuples $(0, x)$ and assigns "value" 1 to the tuple
$(1, 0)$. Observe that the order induced in $\rho(\mathcal{A})$ is $(0, 0) < (0, 1) < \cdots < (0, n-1) < (1, 0)$.
Therefore, $w \in \Gamma$ iff $\rho(w) \in \Gamma'$ which shows that $\Gamma'$ is complete.
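
The padding projection of this proof can be traced on concrete strings with a short Python sketch. It assumes that a binary string is given as a list of bits of length at least 2, so that the tuple (1, 0) exists in the universe; the function name is hypothetical.

```python
from itertools import product


def pad_projection(bits: list[int]) -> list[int]:
    """Image of the string `bits` under the arity-2 padding projection.

    The new universe consists of the pairs (0, y) for 0 <= y < n followed by
    (1, 0), in lexicographic order; the new bit at (x, y) is 1 iff
    (x = 0 and S(y)) or (x = 1 and y = 0).
    """
    n = len(bits)
    universe = [
        (x, y)
        for x, y in product(range(n), repeat=2)  # lexicographic order
        if x == 0 or (x == 1 and y == 0)
    ]
    return [
        1 if (x == 0 and bits[y] == 1) or (x == 1 and y == 0) else 0
        for x, y in universe
    ]
```

The output is the input string followed by a single 1, so every string is mapped to one of its two one-bit extensions, and none of the strings ending in 0 is ever produced by this particular projection.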

Since $\mathcal{C}$ is a nice complexity class, there is a fop $\rho : \mathrm{STRUC}[\tau] \to \mathrm{STRUC}[\sigma]$ that is
injective, of arity $k \ge 2$, that reduces $\Gamma$ to $\Gamma'$. We will show that $\rho$ cannot be onto by
showing that if $w \in \Gamma$, then either $w0 \notin \rho(\Gamma)$ or $w1 \notin \rho(\Gamma)$.

Consider the formula $\varphi(\bar{x})$ that defines the interpretation of $T$ in the structure $\rho(\mathcal{A})$ of
form

$$\varphi(\bar{x}) = \alpha_0(\bar{x}) \vee (\alpha_1(\bar{x}) \wedge \lambda_1(\bar{x})) \vee \cdots \vee (\alpha_r(\bar{x}) \wedge \lambda_r(\bar{x})).$$

We are going to show $w0 \in \rho(\Gamma) \Longrightarrow w1 \notin \rho(\Gamma)$. Suppose that $|w0| = n + 1$ and that
$w0 = \rho(w')$ for some $w' \in \Gamma$ represented by the structure $\mathcal{A}$. Each bit in $w0$ corresponds to a
$k$-tuple in $\rho(\mathcal{A})$, i.e. $w0 \simeq u_0 u_1 \cdots u_n$ where $u_i$ is 1 iff $w' \models \varphi(u_i)$. Since $u_n \simeq 0$, $w' \not\models \alpha_0(u_n)$.
Consider the two cases whether $w' \models \alpha_\ell(u_n)$ for some $1 \le \ell \le r$, or not.

In the latter case, we can conclude that $w'' \not\models \alpha_\ell(u_n)$ for every $w'' \in \{0, 1\}^{|w'|}$ and
$1 \le \ell \le r$ since $\alpha_\ell$, being a numerical formula, obtains a value that only depends on the
size of its input; thus, $w1 \notin \rho(\Gamma)$.

In the former case, $w' \models \alpha_\ell(u_n)$, for some unique $\ell$, and $w' \not\models \lambda_\ell(u_n)$ since $u_n \simeq 0$. Thus,
since $\lambda_\ell(u_n)$ is a literal, some bit of $w'$ determines the value for $u_n$. On the other hand,
observe that

$$w' \in \Gamma \iff \rho(w') = w0 \in \Gamma' \iff w \in \Gamma$$

where the first equivalence follows since $\rho$ is a reduction, and the second by construction
of $\Gamma'$. Furthermore, $\rho$ being injective implies that each bit in $w'$ determines one bit in $w$.
Therefore, there is a bit in $w'$ that determines two bits in $w0$: one bit in $w$ and the rightmost
0. If $w1$ were in $\rho(\Gamma)$, then the same bit in the preimage of $w1$ would determine the same
bit in $w$ and the rightmost 1, this time in an inconsistent manner. Therefore, $w1 \notin \rho(\Gamma)$. □

5 Conclusions 

We have extended the canonical form proposed by Medina and Immerman to all complexity
classes characterized by fragments $\mathcal{L}$ closed under disjunctions, and under conjunctions with
FO. Although Medina and Immerman's method could be generalized to other nice classes
beyond NP, it requires the formulation of "generalized" sentences. Our method, on the
other hand, circumvents this problem by considering the dual operator. Additionally, it is
not clear how Medina and Immerman's method could be used to find canonical forms with
respect to problems that are not "graph" problems, or on classes that do not have complete
problems based on explicit graphs, e.g. PSPACE.

As for the near future, we are currently working on syntactic operators that preserve
completeness via fops for general complexity classes. This subject is also addressed by
Medina [6] where syntactic operators $I : \mathcal{L}[\tau] \to \mathcal{L}[\sigma]$, that map formulae into formulae,
are defined such that if $\Psi$ characterizes an NP-complete problem, then so does $I(\Psi)$. We think
that as inverse images play a fundamental role in (mathematical) analysis, inverse images
of syntactic transformations are worth exploring. In our case, we look for operators $I$ such
that if $I(\Psi)$ defines a complete problem, then $\Psi$ also defines a complete problem; Nijjar also
mentions that such transformations are worth exploring [8]. We believe that such operators
could be used to establish completeness of problems in an easier way.

Logic and Computational Complexity, Wrocław, Poland, July 15, 2007